BBN's Systems for the Chinese-English Sub-task of the NTCIR-9 PatentMT Evaluation
Abstract
This paper describes our work on building a statistical machine translation (SMT) system for the Chinese-English sub-task of the NTCIR-9 patent machine translation (MT) evaluation [17]. First, we applied to patent data the techniques we had developed for improving SMT performance on other types of data. Our results show that most of these techniques also work for patent document translation. Second, we changed our SMT system training to address special characteristics of patent documents. These changes produced additional improvements.
Similar resources
System Description of BJTU-NLP SMT for NTCIR-9 PatentMT
This paper presents the overview of statistical machine translation systems that BJTU-NLP developed for the NTCIR-9 Patent Machine Translation Task (NTCIR-9 PatentMT). We compared the performance between phrase-based translation model and factored translation model in our Patent SMT of Chinese to English and English to Japanese. Factored translation model was proposed as an extended phrase-base...
ZZX_MT: the BeiHang MT System for NTCIR-9 PatentMT Task
In this paper, we describe the ZZX_MT machine translation system for the NTCIR-9 Patent Machine Translation Task (PatentMT). We participated in the Chinese-English translation subtask and submitted three results, which correspond to three different models or decoding algorithms respectively. Both of the first two are phrase-based SMT approaches integrating the BTG constraint into reordering models, and...
Overview of the Patent Machine Translation Task at the NTCIR-9 Workshop
This paper gives an overview of the Patent Machine Translation Task (PatentMT) at NTCIR-9 by describing the test collection, evaluation methods, and evaluation results. We organized three patent machine translation subtasks: Chinese to English, Japanese to English, and English to Japanese. For these subtasks, we provided large-scale test collections, including training data, development data an...
An Improved Patent Machine Translation System Using Adaptive Enhancement for NTCIR-10 PatentMT Task
This paper describes the work that we conducted for the Chinese-English (CE) task of the NTCIR-10 patent machine translation evaluation. We built standard phrase-based and hierarchical phrase-based statistical machine translation (SMT) systems with optimized word segmentation, adaptive language model and improved parameter tuning strategy. Our systems outperform official baselines by approximat...
The RWTH Aachen System for NTCIR-9 PatentMT
This paper describes the statistical machine translation (SMT) systems developed by RWTH Aachen University for the Patent Translation task of the 9th NTCIR Workshop. Both phrase-based and hierarchical SMT systems were trained for the constrained Japanese-English and Chinese-English tasks. Experiments were conducted to compare different training data sets, training methods and optimization criter...